6 research outputs found

    Noise-robust text-dependent speaker identification using cochlear models

    One challenging issue in speaker identification (SID) is achieving noise-robust performance. Humans can accurately identify speakers even in noisy environments, and we can leverage knowledge of the function and anatomy of the human auditory pathway to design SID systems with better noise robustness than conventional approaches. We propose a text-dependent SID system based on a real-time cochlear model, the cascade of asymmetric resonators with fast-acting compression (CARFAC), and investigate its SID performance on signals corrupted by noise of various types and levels. We compare its performance with conventional auditory feature generators, including mel-frequency cepstral coefficients and frequency-domain linear prediction, as well as with another biologically inspired model, the auditory nerve model. We show that CARFAC outperforms the other approaches when signals are corrupted by noise, and that the results are consistent across datasets, noise types and levels, speaking speeds, and back-end classifiers. The noise-robust SID performance of CARFAC is largely due to its nonlinear processing of auditory input signals; presumably, the human auditory system achieves its noise robustness via inherent nonlinearities as well.
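The conventional MFCC front end mentioned above can be sketched end to end: frame the waveform, window it, take the power spectrum, pool it through a triangular mel filterbank, compress with a log, and decorrelate with a DCT. The following NumPy-only sketch is purely illustrative; the function name, frame sizes, and filter counts are assumptions, not the paper's implementation.

```python
import numpy as np

def mfcc(signal, sr, n_fft=512, n_mels=26, n_ceps=13):
    """Toy MFCC front end: frame -> window -> power spectrum
    -> mel filterbank -> log -> DCT-II. Illustrative only."""
    # Frame the signal with 50% overlap and apply a Hamming window.
    hop = n_fft // 2
    n_frames = max(1, (len(signal) - n_fft) // hop + 1)
    frames = np.stack([signal[i * hop:i * hop + n_fft] for i in range(n_frames)])
    frames = frames * np.hamming(n_fft)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Triangular mel filterbank between 0 Hz and the Nyquist frequency.
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, c, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, c):
            fbank[m - 1, k] = (k - lo) / max(c - lo, 1)
        for k in range(c, hi):
            fbank[m - 1, k] = (hi - k) / max(hi - c, 1)
    log_energy = np.log(power @ fbank.T + 1e-10)
    # DCT-II over the filterbank axis yields the cepstral coefficients.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_mels)))
    return log_energy @ dct.T
```

A nonlinearity such as CARFAC's fast-acting compression would replace the simple log compression step in a biologically inspired variant of this pipeline.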

    Event-based feature extraction using adaptive selection thresholds

    Unsupervised feature extraction algorithms form one of the most important building blocks in machine learning systems, and they are often adapted to the event-based domain to perform online learning in neuromorphic hardware. However, because such algorithms were not designed for this purpose, they typically require significant simplification to meet hardware constraints, creating trade-offs with performance. Furthermore, conventional feature extraction algorithms are not designed to generate the useful intermediary signals that become valuable in the context of neuromorphic hardware limitations. In this work, a novel event-based feature extraction method is proposed that addresses these issues. The algorithm operates via simple adaptive selection thresholds, which allow a simpler implementation of network homeostasis than previous works at the cost of a small amount of information loss: events that fall outside the selection thresholds are missed. The behavior of the selection thresholds and the output of the network as a whole are shown to provide uniquely useful signals that indicate network weight convergence without the need to access the network weights. A novel heuristic method for network size selection is proposed that makes use of noise events and their feature representations. The selection thresholds are also shown to produce network activation patterns that predict classification accuracy, allowing rapid evaluation and optimization of system parameters without running back-end classifiers. The feature extraction method is tested on both the N-MNIST (Neuromorphic-MNIST) benchmark dataset and a dataset of airplanes passing through the field of view. Multiple configurations with different classifiers are tested, with the results quantifying the performance gains at each processing stage.
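The adaptive selection threshold rule described above can be illustrated with a small sketch: each neuron keeps a weight vector and its own threshold; an incoming event context is assigned to the most similar neuron whose threshold it exceeds; the winner's weights move toward the context and its threshold tightens; and a missed event relaxes every threshold. The class and parameter names here are assumptions for illustration, not the published implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class FeastLayer:
    """Sketch of feature extraction with adaptive selection thresholds:
    one weight vector and one threshold per neuron."""
    def __init__(self, n_neurons, dim, eta=0.01, d_open=0.01, d_close=0.02):
        self.w = rng.random((n_neurons, dim))
        self.w /= np.linalg.norm(self.w, axis=1, keepdims=True)
        self.thresh = np.zeros(n_neurons)  # start fully open
        self.eta, self.d_open, self.d_close = eta, d_open, d_close
        self.missed = 0  # weight-free convergence signal

    def step(self, context):
        context = context / (np.linalg.norm(context) + 1e-12)
        sim = self.w @ context            # cosine similarity per neuron
        eligible = sim >= self.thresh    # inside the selection threshold?
        if not eligible.any():
            # Missed event: relax (open) every threshold slightly.
            self.thresh -= self.d_open
            self.missed += 1
            return None
        winner = int(np.argmax(np.where(eligible, sim, -np.inf)))
        # Move the winner's weights toward the event context...
        self.w[winner] += self.eta * (context - self.w[winner])
        self.w[winner] /= np.linalg.norm(self.w[winner])
        # ...and tighten only the winner's threshold.
        self.thresh[winner] += self.d_close
        return winner
```

The running miss count and the threshold trajectories are the kind of weight-free intermediary signals the abstract describes for monitoring convergence.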

    Neuromorphic engineering needs closed-loop benchmarks

    Neuromorphic engineering aims to build (autonomous) systems by mimicking biological systems. It is motivated by the observation that biological organisms, from algae to primates, excel at sensing their environment and reacting promptly to perils and opportunities. Furthermore, they do so more resiliently than our most advanced machines, at a fraction of the power consumption. It follows that the performance of neuromorphic systems should be evaluated in terms of real-time operation, power consumption, and resiliency to real-world perturbations and noise, using task-relevant evaluation metrics. Yet, following in the footsteps of conventional machine learning, most neuromorphic benchmarks rely on recorded datasets that foster sensing accuracy as the primary measure of performance. Sensing accuracy, however, is only a proxy for the system's actual goal: making a good decision in a timely manner. Moreover, static datasets hinder our ability to study and compare the closed-loop sensing and control strategies that are central to survival for biological organisms. This article makes the case for a renewed focus on closed-loop benchmarks involving real-world tasks. Such benchmarks will be crucial in developing and progressing neuromorphic intelligence. The shift towards dynamic real-world benchmarking tasks should usher in richer, more resilient, and more robust artificially intelligent systems in the future.

    Investigation of auditory nerve model and conventional approaches in noise-robust speaker identification

    Automatic Speaker Identification (SID) is growing to meet the current demands of human-machine interaction in fields such as self-driving vehicles, access to smartphones and laptops, and online security. These services become challenging when background noise is present. To achieve noise-robust performance in adverse conditions, we propose two front-end feature extraction algorithms based on the Auditory Nerve (AN) model. One algorithm uses the energies of the Inner Hair Cell (IHC) response from the AN model; the other uses the energies of the AN model's linear chirp filter followed by cube-root compression and the Discrete Cosine Transform (DCT). We investigate which algorithm performs better in the SID task, using a modified Gammatone Filter Cepstral Coefficient (GFCC) front end as a reference. We tested these algorithms on text-dependent and text-independent speech under clean and noisy conditions. This work shows that the proposed algorithms perform substantially better than a previously proposed algorithm based on the AN model. The algorithms with conventional nonlinearities significantly outperform the IHC algorithm in the noise-robust SID task; however, applying conventional nonlinearities to the IHC algorithm significantly improves its SID performance.
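The second front end described above (filterbank energies, cube-root compression, then a DCT) can be sketched in a few lines. This is a hedged illustration: `compressed_cepstra` and its parameters are assumed names, and the linear chirp filterbank itself is not reproduced here; its per-channel energies are taken as the input.

```python
import numpy as np

def compressed_cepstra(energies, n_ceps=13):
    """Cube-root compression followed by a DCT-II across filter
    channels, mirroring the conventional nonlinearity described
    above. Input: (frames, channels) filterbank energies."""
    compressed = np.cbrt(energies)  # cube-root amplitude compression
    n_ch = compressed.shape[-1]
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_ch)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_ch))  # DCT-II basis
    return compressed @ dct.T
```

Swapping `np.cbrt` for `np.log` recovers the familiar log-compressed cepstral pipeline, which is the usual point of comparison.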

    Event-driven spectrotemporal feature extraction and classification using a silicon cochlea model

    This paper presents a reconfigurable digital implementation of an event-based binaural cochlear system on a Field Programmable Gate Array (FPGA). It consists of a pair of Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlea models and leaky integrate-and-fire (LIF) neurons. Additionally, we propose event-driven SpectroTemporal Receptive Field (STRF) Feature Extraction using Adaptive Selection Thresholds (FEAST). The system is tested on the TIDIGITS benchmark and compared with current event-based auditory signal processing approaches and neural networks.
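The leaky integrate-and-fire neurons used in the system above follow a standard discrete-time update: leak the membrane potential toward rest, integrate the input current, and emit a spike and reset when the threshold is crossed. A minimal software sketch, with all parameter values assumed (the paper's FPGA implementation is not reproduced here):

```python
import numpy as np

def lif(input_current, tau=20.0, v_th=1.0, v_reset=0.0, dt=1.0):
    """Discrete-time leaky integrate-and-fire neuron.
    Returns a boolean spike train the same length as the input."""
    v = v_reset
    spikes = np.zeros(len(input_current), dtype=bool)
    for t, i_in in enumerate(input_current):
        # Leak toward the resting potential and integrate the input.
        v += dt * (-(v - v_reset) / tau + i_in)
        if v >= v_th:
            spikes[t] = True   # threshold crossed: emit a spike
            v = v_reset        # and reset the membrane potential
    return spikes
```

In an event-based cochlear pipeline, each CAR-FAC channel output would drive one such neuron, converting the analog channel signal into a sparse spike train.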

    A binaural sound localization system using deep convolutional neural networks

    We propose a biologically inspired binaural sound localization system that uses a deep convolutional neural network (CNN) for reverberant environments. It utilizes a binaural Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear system to analyze binaural signals, a lateral inhibition function to sharpen the temporal information of the cochlear channels, and an instantaneous correlation function applied to the two cochlear outputs to encode binaural cues. The generated 2-D instantaneous correlation matrix (correlogram) encodes both interaural phase difference (IPD) cues and spectral information in a unified framework. Additionally, a sound onset detector is used to generate correlograms only during sound onsets, removing interference from echoes. The onset correlograms are analyzed by a deep CNN that regresses to the azimuthal angle of the sound source. The proposed system was evaluated on experimental data recorded in a reverberant environment and achieved a root mean square localization error (RMSE) of 3.68° over the −90° to 90° range.
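The correlation idea above can be illustrated for a single pair of cochlear channels: correlate the left and right signals over a range of interaural lags and read off the peak. This is an assumption-laden simplification; the paper's correlogram is a 2-D matrix across all channels fed to a CNN, whereas the function and parameters below are illustrative only.

```python
import numpy as np

def correlogram(left, right, max_lag=16):
    """Cross-correlation of one left/right cochlear channel pair over a
    range of sample lags; the peak lag approximates the interaural delay."""
    lags = np.arange(-max_lag, max_lag + 1)
    corr = np.array([np.sum(left * np.roll(right, lag)) for lag in lags])
    return lags, corr
```

Stacking one such correlation row per cochlear channel yields a 2-D lag-by-channel matrix, which is the shape of input a CNN-based localizer would consume.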